Analytical approaches for evaluating passive acoustic monitoring data: A case study of avian vocalizations

Abstract The interface between field biology and technology is energizing the collection of vast quantities of environmental data. Passive acoustic monitoring, the use of unattended recording devices to capture environmental sound, is an example where technological advances have facilitated an influx of data that routinely exceeds the capacity for analysis. Computational advances, particularly the integration of machine learning approaches, will support data extraction efforts. However, the analysis and interpretation of these data will require parallel growth in conceptual and technical approaches for data analysis. Here, we use a large hand‐annotated dataset to showcase analysis approaches that will become increasingly useful as datasets grow and data extraction can be partially automated. We propose and demonstrate seven technical approaches for analyzing bioacoustic data. These include the following: (1) generating species lists and descriptions of vocal variation, (2) assessing how abiotic factors (e.g., rain and wind) impact vocalization rates, (3) testing for differences in community vocalization activity across sites and habitat types, (4) quantifying the phenology of vocal activity, (5) testing for spatiotemporal correlations in vocalizations within species, (6) among species, and (7) using rarefaction analysis to quantify diversity and optimize bioacoustic sampling. To demonstrate these approaches, we sampled in 2016 and 2018 and used hand annotations of 129,866 bird vocalizations from two forests in New Hampshire, USA, including sites in the Hubbard Brook Experimental Forest where bioacoustic data could be integrated with more than 50 years of observer‐based avian studies. Acoustic monitoring revealed differences in community patterns in vocalization activity between forests of different ages, as well as between nearby similar watersheds.
Of numerous environmental variables that were evaluated, background noise was most clearly related to vocalization rates. The songbird community included one cluster of species where vocalization rates declined as ambient noise increased and another cluster where vocalization rates declined over the nesting season. In some common species, the number of vocalizations produced per day was correlated at scales of up to 15 km. Rarefaction analyses showed that adding sampling sites increased species detections more than adding sampling days. Although our analyses used hand‐annotated data, the methods will extend readily to large‐scale automated detection of vocalization events. Such data are likely to become increasingly available as autonomous recording units become more advanced, affordable, and power efficient. Passive acoustic monitoring with human or automated identification at the species level offers growing potential to complement observer‐based studies of avian ecology.


| INTRODUCTION
Ecological insights and informed conservation rely on understanding when and where organisms occur (Fisher et al., 1943;MacArthur, 1984). Ecologists and conservation biologists have used many different approaches to document the distribution of organisms, ranging from detailed observations by skilled field personnel to aerial overflights and analysis of trace environmental DNA (Dejong & Emlen, 1985;Ficetola et al., 2008;Hodgson et al., 2016;Scott et al., 1981).
Technological advances continue to provide new avenues for monitoring habitats, with acoustic analysis rapidly gaining prominence as a powerful method for assessing the distribution and behavior of animals (Sugai et al., 2019). Passive acoustic monitoring (PAM) is a sampling approach that uses unattended audio recorders to sample sounds over large swaths of space and time (Sugai et al., 2019). Autonomous recording units (ARUs) collect data without the presence of a human observer and provide an enduring record of habitat use, behavioral patterns, phenology, and changes in sound production by wildlife over time (Davis et al., 2017; Desjonquères et al., 2020; Wood et al., 2019a).
Passive acoustic monitoring also facilitates the detection of species that are uncommon, secretive, or occur during seasons, times of day, or weather conditions when human observers are less likely to sample (Sebastián-González et al., 2018). As autonomous recording units become more advanced, affordable, and power efficient, passive acoustic monitoring offers a complementary and non-invasive approach for ecological studies and biodiversity monitoring (Gibb et al., 2019;Potamitis, 2014;Sebastián-González et al., 2018;Sugai et al., 2019;Xie et al., 2018). Furthermore, automated ARUs allow for broader temporal and spatial sampling and minimize the potential for in-field observer bias (Sugai et al., 2019).
Currently, passive acoustic monitoring data are often analyzed by humans who review spectrograms (visual images of acoustic information) and listen to audio recordings to identify species. However, manual annotation is time-consuming and limits the amount of data that can be processed. The ability to survey many more locations for longer periods of time provides crucial data, but also raises new challenges and opportunities in data analysis. Advancements in automation, particularly machine learning approaches, are poised to accelerate and scale annotation dramatically (Kahl et al., 2018;Shiu et al., 2020;Vickers et al., 2019). Advances in data extraction capacity must therefore be met by parallel advancements in methodological frameworks and statistical analysis (Gasc et al., 2017;Gibb et al., 2019;Sebastián-González et al., 2018;Wood et al., 2021).
Much work has been done on the marine soundscape, with a long and robust history of using acoustics to study marine mammals (Lin et al., 2016; Marques et al., 2009, 2011; Matthews et al., 2014; Rice et al., 2019). However, the application of passive acoustic monitoring to terrestrial systems is more recent (Sebastián-González et al., 2018; Sugai et al., 2019), with studies utilizing passive acoustic monitoring becoming more widespread in the mid-2000s (Sugai et al., 2019). Although some of the approaches that we consider may be relevant to marine work, our focus here is on soundscape approaches for terrestrial animals that vocalize frequently, such as birds, anurans, and some mammals. Compared to other terrestrial organisms, birds are one of the best-known and best-studied taxonomic groups, with vocalizations that are used in diverse contexts, including territoriality and resource defense, attraction of mates, and alerting other birds to the presence of a predator (Webster & Podos, 2018).
We develop and present methods for the refinement and analysis of acoustic data obtained from passive acoustic monitoring. We begin with methods that are currently used with small manually generated datasets but are suitable for expansion to much larger datasets. We then present analytical approaches that are only feasible with large samples in space and time. These approaches include: (1) the generation of species lists (Lellouch et al., 2014; Luther, 2009) and descriptions of vocal variation in traits such as duration and frequency (e.g., Duan et al., 2011; Planqué & Slabbekoorn, 2008; Potamitis, 2014; Towsey et al., 2012; Xie et al., 2018), providing an account of the species detected in a given area and time period. If vocalization rates are measured on a fine-grained scale (e.g., minutes, hours, or days), it becomes possible to estimate (2) how vocalization rates are impacted by abiotic factors such as precipitation (Bruni et al., 2014; Hasan, 2010; Keast, 1994; Lengagne & Slater, 2002), wind (Hasan, 2010; Lengagne et al., 1999), and temperature (Bruni et al., 2014; Gottlander, 1987; Keast, 1994; Thomas, 1999).
Passive acoustic data can also be combined with information about habitat type and land-use history to produce (3) community patterns in vocalization activity across sites (Depraetere et al., 2012; Gasc et al., 2013; Rodriguez et al., 2014). Detailed data on vocalizations over time also make it possible to quantify the (4) timing of vocal activity, such as changes in acoustic signaling across hours to days (Gasc et al., 2013; Rodriguez et al., 2014; Towsey et al., 2012) or months to years (Towsey et al., 2014). This approach can be used directly to answer research questions, such as whether a warming climate shifts activity dates (Llusia et al., 2013), or to control for the impact of phenology and diurnal patterns on other analyses and comparisons. Large-scale synchronized recording can also provide a novel tool for behavioral research (Tobias et al., 2014). By deploying passive acoustic recorders that record multiple locations simultaneously, it becomes possible to apply statistical tools from population ecology to test for (5) spatiotemporal correlations in vocalizations within species and (6) among species (e.g., Brumm, 2006; Burt & Vehrencamp, 2005; Laiolo et al., 2011; Luther, 2009; Planqué & Slabbekoorn, 2008; Tobias et al., 2014). Finally, (7) species accumulation functions and optimization of bioacoustic sampling schemes use rarefaction analyses to describe species richness across scales and can aid in planning and designing acoustic sampling schemes (Dixon et al., 2020; Marín-Gómez et al., 2020; Naithani et al., 2018).
To demonstrate these approaches, we used passive acoustic recordings of the dawn birdsong chorus, with manual counts of the number of vocalizations per species per unit time. These approaches provide a package of tools for approaching and interpreting acoustic data to test ecological hypotheses and assess biodiversity across space and time.

| Study sites
Acoustic sampling was conducted in hardwood forests at 500-800 m elevation in the White Mountains of New Hampshire, USA.
Study sites (Figure 1, Table S1) were located in the Hubbard Brook Experimental Forest and in a similar habitat within the Jeffers Brook Forest, approximately 15 km from Hubbard Brook (Table S1). Hubbard Brook Experimental Forest was established in 1955 with a focus on hydrologic and forest science (Holmes & Likens, 2016), and studies of avian ecology have been running continuously since 1969 (e.g., Holmes, 2011; Holmes et al., 1979; Holmes & Sherry, 2001; Holmes & Sturges, 1975; Townsend et al., 2013).
The forests at both Hubbard Brook and Jeffers Brook consist of variably aged second growth northern hardwoods, dominated by sugar maple (Acer saccharum), American beech (Fagus grandifolia) and yellow birch (Betula alleghaniensis), with occasional white ash (Fraxinus americana), white birch (Betula papyrifera), red spruce (Picea rubens), and balsam fir (Abies balsamea) (Campbell et al., 2007). The understory included saplings of the canopy species (especially American beech) as well as patches of hobblebush (Viburnum lantanoides) and occasional striped maple (Acer pensylvanicum). Both sites contained relatively mature forests (last harvested in 1910-1915) and middle-aged stands (clearcut in 1970-1975) (Goswami et al., 2018). In mid-aged stands compared to mature stands, the diameters of the largest trees were smaller (<40 versus up to ~50 cm diameter), but the number of trees per hectare was higher, with the result that above-ground biomass was similar (basal area = 25-35 m²/ha). Bird vocalizations from the two ages of forests (hereafter "mature" and "mid-aged," respectively) were sampled at both sites to compare acoustic samples from nearby forests of different successional stages.

FIGURE 1 Relative positions of audio recorders within study sites in Jeffers Brook and Hubbard Brook watersheds, NH. Coordinates and site characteristics are in Table 1. Base images are from Google Earth

| Data collection and recording hardware
This paper contains two acoustic datasets, one collected in 2016 and the other in 2018. In 2016, we conducted sampling to compare avian vocalizations in mid-aged and mature forest stands, replicated across Hubbard Brook and Jeffers Brook watersheds.
In 2018, sampling was concentrated in Hubbard Brook forest and designed to provide high resolution within one forest area, allowing for more detailed examination of spatial patterning in vocalization activity. In both years, recorders were activated each morning for a 10-minute period spanning 06:20-06:30 local time (UTC-4).
Depending on the date, the recordings started 55-75 min after sunrise. The 10-min interval for recording bouts parallels a common point count duration (Buskirk & McDonald, 1995) and was chosen to be long enough to capture most species vocalizing at that site on that morning, but short enough that we could still annotate many different mornings and compare inter- and intraspecific patterning of vocalizations among days (Tobias et al., 2014).
For annotation, we chose a sample size of 20 dates per year as being both sufficient and manageable. We selected the dates within years such that they were distributed throughout the period of available recordings, but with longer intervals between dates later in the season when there were generally fewer vocalizations.
Dates for annotation were chosen in advance of examining the sound recordings and so were not biased with respect to vocalization activity or ambient sound levels.
We used Olympus DS-40 recorders (Olympus, Center Valley, PA, USA) deployed in plastic boxes and connected to their original microphone by a 1-m extender cable. Each microphone was placed at a height of 2 m and was suspended below a fabric rain shield 25 cm in diameter. The recorders generated MP3 files with a sampling rate of 44.1 kHz on the "high-quality" setting, with the manufacturer's maximum microphone sensitivity, no frequency filter, and no variable control voice actuator. The MP3 files were converted to 16 bit WAV files using Switch Plus converter (NCH software, Canberra, Australia) so that recordings could be digitally analyzed and manipulated. The compressed MP3 format discards some high-frequency information, resulting in smaller files, but lower acoustic resolution, particularly at frequencies higher than those used by most bird species. These missing data are not recovered with the conversion to WAV format.
In 2016, three recorders were in the mature forest and two were in mid-aged forest in both the Hubbard and Jeffers Brook forests. Within watersheds, recorders were separated by  In 2018, we sampled vocalizations only within Hubbard Brook Experimental Forest. The 10 recorders were distributed across an area that has been the focus of long-term studies of breeding songbirds (Holmes et al., 1986;Holmes, 2011;Rodenhouse & Holmes, 1992;Townsend et al., 2013). Distances between recorders ranged

| Data selection and annotation
Recordings were annotated with species names by listening to sound recordings and looking at spectrograms (visual and auditory review). To review recordings visually we used the spectrogram view in the sound analysis software RavenPro (version 1.5.0 Build 43 for Windows, 2017). The DFT size was 512 samples with an overlap of 50%, giving a resolution of ± 256 Hz. The spectrogram of each recording was viewed in a standard gamma II color scheme with a power threshold floor setting of 56 dB, although it should be noted that these recordings are not calibrated and this dB value is relative to the arbitrary reference value of the Raven software. For recordings with high background noise, the floor threshold was gradually raised to diminish noise and highlight avian acoustic communication. All reviews of the spectrograms and sound were conducted by one of us (KDK) who was experienced with the vocalizations of this bird community. Noise-cancelling over-ear headphones were used during review. Bird vocalizations, consisting of songs and calls, were identified to species. We tallied only vocalizations with a recognizable spectrogram that was clearly distinguishable by eye and ear from background noise. Two 10-minute samples in 2018 occurred during substantial rain and were excluded from further analysis (see Approach 2 for additional details on quantifying and addressing sound from rain and wind). Occasional high-amplitude vocalizations exceeded the maximum amplitude that the recording system could record accurately (a phenomenon known as "clipping"), but clipped vocalizations could still be identified to species. The sound recordings were reviewed in random order to limit effects from listener bias and listener learning. The complete species annotations for all 410 10-min recordings are deposited (Symes et al., 2021).
Each 10-min recording was analyzed by counting the number of vocalizations (calls and songs) of each species present in the recording. Our objective was to recognize species and quantify their vocalization activity, so we did not attempt to distinguish between songs and calls, but we tested for the uniformity of vocalizations within species.
The duration and structure of bird vocalizations varied among species. For example, Red-eyed Vireos (Vireo olivaceus) had a short but repeated song that included two elements over only about 700 ms. Black-throated Blue Warblers had a song with several elements over about 1.5 s, whereas Winter Wrens (Troglodytes hiemalis) had songs of 5-10 s that consisted of a dozen or more elements per second. Operationally, we defined a vocalization event as an acoustic element separated from others by a pause of more than one second. This was partly subjective, so we characterized our operational definitions with sample sound recordings, associated spectrograms, and statistical analyses of mean duration and dominant frequency.
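The one-second pause rule can be illustrated with a short sketch. It assumes that the onset and offset times of individual acoustic elements are already available for one species in one recording; the function name `group_into_events` and the element times are hypothetical and are not part of the annotation workflow described here, which was performed by a human reviewer.

```python
def group_into_events(elements, max_pause=1.0):
    """Merge (start, end) acoustic elements into vocalization events.

    Elements separated by a pause of at most `max_pause` seconds are
    treated as one event; a longer pause starts a new event.
    """
    events = []
    for start, end in sorted(elements):
        if events and start - events[-1][1] <= max_pause:
            # Pause is short enough: extend the current event.
            events[-1][1] = max(events[-1][1], end)
        else:
            # Pause exceeds the threshold (or first element): new event.
            events.append([start, end])
    return [tuple(e) for e in events]


# Hypothetical element times (seconds) for one species in one recording.
elements = [(0.0, 0.3), (0.4, 0.7), (2.5, 2.8), (2.9, 3.2), (9.0, 9.6)]
events = group_into_events(elements)
print(len(events), events)
```

The count of events per species per 10-min recording would then play the role of the vocalization rate used in the analyses that follow.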
Occasional incomplete vocalizations were still scored as one vocalization if they could be identified to species. Vocalizations of two mammals, red squirrels (Tamiasciurus hudsonicus) and Eastern chipmunks (Tamias striatus), were also recorded in the annotations of sound recordings from 2018.

| Repeatability of annotation
At the end of the annotation process, the same observer (KDK) blindly re-annotated thirteen randomly selected 10-min recordings. This allowed us to assess the consistency of the species lists and call counts.
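As a rough illustration of how such a consistency check might be computed, the sketch below compares two passes over the same recordings for a single species using the squared Pearson correlation. The counts are invented for illustration, and the helper `pearson_r` is a plain-Python stand-in rather than the software used in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical counts of one species in 13 recordings, annotated twice.
first_pass = [12, 40, 0, 7, 55, 23, 31, 18, 4, 62, 9, 27, 35]
second_pass = [11, 42, 0, 8, 53, 25, 30, 18, 5, 60, 9, 26, 37]

r_squared = pearson_r(first_pass, second_pass) ** 2
print(f"r^2 between annotation passes: {r_squared:.3f}")
```

Repeating this calculation per species gives one repeatability value per species; comparing the sets of species names from the two passes addresses consistency of the species lists.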

| Data distributions, transformations, and analyses
Species-specific vocalization rates (number per 10-min recording) had frequency distributions that were skewed toward higher vocalization rates (approximated gamma distributions). These distributions were well-normalized with a square root transformation, which facilitated statistical analyses. The analyses are summarized in Table S2.
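The effect of the square root transformation on a right-skewed count distribution can be checked with a few lines of code. This is a minimal sketch with invented counts; the `skewness` helper uses the simple population (moment) definition, which is an assumption made for illustration and not necessarily the diagnostic used in the analyses summarized in Table S2.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical vocalizations per 10-min recording for one species.
counts = [0, 1, 2, 3, 4, 4, 6, 9, 9, 16, 25, 36, 49, 81]
rooted = [math.sqrt(c) for c in counts]

print(f"skewness of raw counts:         {skewness(counts):.2f}")
print(f"skewness after sqrt transform:  {skewness(rooted):.2f}")
```

A skewness closer to zero after transformation indicates a more symmetric distribution, which is the property that the subsequent parametric analyses rely on.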

| Species lists and descriptions of vocal variation
Determining the list of species present in a site is among the most basic uses of annotated data and underlies many management and conservation decisions. We used the annotated data to generate an overall species list for Hubbard Brook that we could compare to decades of field observations from this well-studied bird community.
We quantified species-specific patterns of vocalizations for the songs of the eight most common songbird species and compared them to determine how species were differentiated by duration and frequency. To select vocalizations for analysis, we isolated and ana-

| Relationships among environmental variables, vocalization activity, and acoustic detection
Environmental variables such as rain, wind, cloud cover, barometric pressure, and temperature can impact avian physiology and behavior, as well as signal transmission and the probability of detecting a vocalization on a recording unit (Bruni et al., 2014).
Understanding the interaction between vocalization and abiotic factors informs natural history (Bruni et al., 2014; Lengagne & Slater, 2002) and can have value for identifying the habitats and sampling windows that will be most valuable for observer-based fieldwork.
We analyzed weather data that were collected in Hubbard Brook

Quantification of rain and wind intensity
Rain and wind pose particular challenges for bioacoustics because they can affect both the rate of signaling in animals and the detectability of signals. We employed multiple approaches to quantifying background sound pressure from rain, wind, and water drops. The first approach was to reference hourly data from nearby weather stations (see above). A second approach was human review and evaluation of acoustic signatures associated with rain and wind (Towsey et al., 2012). Sound from water drops was ranked on a five-tier scale by listening to the recordings and visualizing spectrograms: absence (0), drizzle to light (1), moderate and constant (2), hard rain (3), and very hard rain (4) ( Figure S1). Wind, which tended to be audible but with a broadband spectral contribution below the visualization threshold, was ranked on a three-tier auditory scale: absence (0), soft (1), or hard (2).
We also employed two statistical assessments of ambient sound pressure as recorded in the wave files: A_manual and A_automated.
Amplitude-calibrated equipment is currently rare in terrestrial PAM. Our equipment was not amplitude calibrated and consequently, the measurements are proportional to sound pressure levels but cannot be represented as absolute values. Our calculations assumed that microphone sensitivity was approximately equal across recorders and across the duration of the recording season (verified by recording the same tone series using all recorders at the beginning and end of the seasons). For the calculation of A_manual, we randomly selected 42 audio recordings from 2016 and manually identified (using the spectrogram view in RavenPro) a sample of one-second sound snips without bird vocalizations from each recording. To do so, we selected a random second within each minute and manually moved forward in the recording (by auditory review and examination of spectrograms) to locate the next one-second interval that did not contain bird vocalizations, allowing us to sample the background throughout the recording.
Sometimes, there were no one-second intervals within the next minute without bird vocalizations. In those cases, we moved forward to the following minute. From 42 10-minute sound files, we

We calculated correlations across dates among all pairs of environmental factors (temperature, wind, ambient sound, etc.) and between environmental factors and bird vocalization rates.
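An automated ambient sound measure of the kind referred to above can be sketched as follows. The sketch assumes that the automated metric is the log10 of the root-mean-square (RMS) amplitude of the quietest 10th percentile of one-second windows, which is inferred from the Results and the caption of Figure 2 rather than stated in full here. The function names, the nearest-rank percentile rule, and the synthetic signal are all illustrative assumptions, and the values are relative because the recordings are not amplitude calibrated.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ambient_level(samples, sample_rate, percentile=10):
    """log10(RMS) of the quietest `percentile` of one-second windows."""
    window = sample_rate  # number of samples in one second
    levels = sorted(
        rms(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    )
    # Nearest-rank percentile (an assumed convention).
    rank = max(1, math.ceil(percentile / 100 * len(levels)))
    return math.log10(levels[rank - 1])


# Synthetic 20-s signal: quiet background with a loud "bird" every third
# second. The 100 Hz sample rate is hypothetical and chosen only to keep
# the example small; the recordings in the text were sampled at 44.1 kHz.
rng = random.Random(1)
sample_rate = 100
samples = []
for second in range(20):
    scale = 0.5 if second % 3 == 0 else 0.01
    samples.extend(rng.gauss(0.0, scale) for _ in range(sample_rate))

level = ambient_level(samples, sample_rate)
print(f"ambient level, log10(RMS): {level:.2f}")
```

Because the metric targets the quietest windows, it should track the background (here a standard deviation of 0.01, or about -2 on the log10 scale) rather than the vocalizations, although windows that contain faint acoustic events can still inflate it.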

| Community patterns in vocalization activity
Passive acoustic monitoring is well-suited for revealing how species are associated with different habitats. Often, habitat affinity is described at a coarse scale (e.g., old growth forest, marshlands), with conservation decisions following comparably broad classes. But there can be substantial heterogeneity within recognized habitats due to, for example, diverse plant communities, topography, and proximity to water. Understanding where species spend time within preferred habitat types can help to identify and protect the most valuable areas within critical habitats.
We evaluated patterns in vocalization rates across two habitat types using our site comparison dataset, collected in 2016. We employed an ANOVA that included forest type (mature and mid-aged), watershed (Jeffers Brook and Hubbard Brook), forest type × watershed, and date as fixed effects. To avoid concerns regarding spatial independence of recorders (Hurlbert, 1984), the data frame for the ANOVA was the average on each sample day of the 2-3 recorders within each forest type × watershed. Vocalizations per 10 min were square-root transformed prior to analysis, which satisfied assumptions of homoscedasticity. Visual examination revealed no temporal autocorrelation in the residuals.
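The construction of the data frame for this ANOVA can be sketched as a simple aggregation step. The record layout, site codes, and counts below are hypothetical, and the sketch assumes that the square root transformation is applied to each recorder's count before averaging, which the text does not specify.

```python
import math
from collections import defaultdict

# Hypothetical records:
# (date, watershed, forest type, recorder ID, vocalizations per 10 min)
records = [
    ("2016-06-01", "HB", "mature", "r1", 16),
    ("2016-06-01", "HB", "mature", "r2", 25),
    ("2016-06-01", "HB", "mature", "r3", 36),
    ("2016-06-01", "HB", "mid-aged", "r4", 9),
    ("2016-06-01", "HB", "mid-aged", "r5", 49),
    ("2016-06-01", "JB", "mature", "r6", 4),
    ("2016-06-01", "JB", "mature", "r7", 16),
]


def cell_means(records):
    """Average sqrt-transformed counts over recorders within each
    date x watershed x forest-type cell (one row per cell)."""
    groups = defaultdict(list)
    for date, watershed, forest, _recorder, count in records:
        groups[(date, watershed, forest)].append(math.sqrt(count))
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


means = cell_means(records)
for key, value in sorted(means.items()):
    print(key, round(value, 2))
```

Each resulting row is one observation for the model, so recorders within the same stand are not treated as independent replicates.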
Repeated sampling across 19 dates in 2018 permitted the construction of species accumulation curves to evaluate the completeness of species detections at each sampling location (see Approach 7).

| Timing of vocal activity
In mid- to high-latitude systems, the annual timing of breeding events by birds can vary from year to year. For example, at Hubbard Brook, the initiation of first clutches by Black-throated Blue Warblers varied by 20 days across 25 years (Lany et al., 2016). The annual timing of vocal activity would also be expected to vary among years, but data are limited (Buxton et al., 2016; Furnas & McGrann, 2018). The phenology of vocalization activity could be informative with respect to behavior, reproduction, climatic patterns, and other environmental conditions. For example, the number of days of singing per season could be relatively constant from year to year or might vary depending on environmental conditions that influence the number and timing of clutches.
To evaluate phenological patterns, we plotted vocalization rates for each species by date across the breeding season.
Weather tends to co-vary over relatively large spatial scales, whereas social environments and predators tend to be more local. Therefore, we propose and test the hypothesis that weather will generate spatial correlations in day-to-day vocalization activity at the scale of tens of kilometers, whereas social interactions and predators predict correlations at the scale of hundreds of meters. We tested these predictions with analyses of spatial correlations in day-to-day vocali- where r = correlation coefficient and n = sample size (Neter et al., 1985).
With the data from 2018, where we had a range of inter-recorder distances, we were able to calculate continuous spline correlograms with 95% confidence intervals using the R package ncf (Bjornstad & Bjornstad, 2020). For these correlograms, we added data points representing the Pearson correlation coefficients vs. distance for all pairs of recorders; these data points were not independent because 10 recorders yielded 45 pairs. While these points did not influence the correlograms or the confidence intervals produced by ncf, they were plotted to facilitate data visualization. We omitted two dates with heavy rain and very low vocalizations (Figure 4).

This randomization was repeated 1000 times for each species pair to generate a distribution of correlation coefficients. We then compared the correlation coefficients of the actual data to the distribution of coefficients from the date-randomized data to search for pairs of species that were more or less correlated than would be expected by chance.
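A date-randomization test of this general kind can be sketched as follows. The exact randomization scheme is not fully recoverable from the text, so the sketch assumes the simplest version: the date order of one species is shuffled while the other is held fixed, and the observed Pearson correlation is compared with the resulting null distribution. The function names, the two-sided p-value convention, and the vocalization rates are illustrative assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def date_randomization_test(x, y, n_iter=1000, seed=0):
    """Compare the observed correlation with correlations obtained after
    shuffling the dates of one species (assumed randomization scheme)."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    null = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        null.append(pearson_r(x, shuffled))
    # Two-sided p-value with the usual +1 correction.
    extreme = sum(abs(r) >= abs(observed) for r in null)
    return observed, (extreme + 1) / (n_iter + 1)


# Hypothetical sqrt-transformed daily rates for two species on 19 dates.
species_a = [3, 5, 4, 6, 8, 7, 9, 6, 5, 4, 3, 2, 4, 5, 7, 8, 6, 5, 3]
species_b = [v + (0.5 if i % 2 else -0.5) for i, v in enumerate(species_a)]

observed_r, p_value = date_randomization_test(species_a, species_b)
print(f"observed r = {observed_r:.2f}, randomization p = {p_value:.3f}")
```

Shuffling whole dates preserves each species' distribution of daily rates while breaking any shared day-to-day signal, which is why the null distribution is a reasonable reference for "expected by chance."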
To explore for natural groupings among species in vocalization behavior, we also used a principal components analysis to evaluate the correlation matrix of interspecies vocalization rates (square-root-transformed rates; rows as dates and columns as species). We then tested for correlations between the resulting principal components and environmental variables.

| Species accumulation and optimization of bioacoustic sampling schemes
We analyzed our 2018 data using EstimateS 9.1.0 Biodiversity Estimation Software (Colwell, 2013) to (1)

| Species lists and descriptions of vocal variation
Our sample of 410 10-min sound recordings included 129,866 vocalizations from 44 bird species and two mammal species (Table 1).

Vocalizations were relatively stereotypical within species (Figure S2), including in duration, frequency, and pattern (Table 2). The longest vocalizations were produced by Winter Wrens (mean = 6.2 s). The remaining species had vocalizations ranging from 0.6 s (Red-eyed Vireo) to 2.7 s (Ovenbird; Seiurus aurocapilla). The peak frequency of vocalizations ranged from 2.9 to 5.5 kHz (Swainson's Thrush

There were no cases where a species was added or lost in a blind second annotation, and there was high repeatability in the counts of vocalizations per species per recording (r² = 0.92 to 0.99, depending on the species; n = 13; Figure S3). The modest differences between replicate counts from the same sound recordings resulted from low amplitude vocalizations that were on the edge of detectability and were counted in one sample but not the other.

| How vocalization rates are impacted by abiotic factors
Air temperatures during our acoustic sampling ranged from 7 to 20°C (mean ± SD = 15 ± 4°C;  Figure S1), apparently the result of condensation in the canopy. Such dripping was more pronounced when temperatures approached the dewpoint and when there was wind.
The sound pressure in randomly chosen seconds when no birds were vocalizing varied among sampling occasions and was well correlated with automated quantification of ambient sound pressure (Figure 2).
The divergence between the metrics at high sound pressure levels likely reflects the fact that when focal seconds were chosen, we selected only quiet seconds (up to 10, but often fewer). The automated analysis identified the quietest 10th percentile, which may still contain some acoustic events when recordings had substantial acoustic ac-

TABLE 2 Attributes of vocalizations of eight common species of breeding songbirds at Hubbard Brook. See Table 1 for full species names

more background sound pressure than the quietest day; the other 19 days differed by no more than 4-fold in background sound pressure (Figure S4).
The number of bird vocalizations from all species that was detected per recorder per 10-min sampling occasion was negatively correlated in both years with a set of intercorrelated variables related to ambient sound: wind speed, wind sounds, and dripping sounds (Tables 3 and 4). Overall vocalization activity per day was negatively related in both years to wind speed and ambient sound (Table 5). There was a weaker positive association with barometric pressure (significant in 2016 but not 2018). For the most common species, vocalization activity and date were often correlated, but the direction of this correlation was inconsistent between years for every common species except Ovenbirds (Table 5). There were some additional correlations between vocalization activity and environmental variables, but other than relations with wind speed and ambient sound they were infrequent and inconsistent (Table 5).

| Community patterns in vocalization activity
Most bird species occurred in recordings from both Hubbard Brook and Jeffers Brook, but some species were detected primarily at one location (Figure 3, Table S4). There were clear associations between species and forest age, with Ovenbirds and Winter Wrens detected at higher rates in mature forests, while American Redstarts were more commonly detected in middle-aged forest (Figure 3, Table S4).

| Timing of vocalization activity
Data from 2016 and 2018 were recorded in different nearby locations. In both years, most of the bird species were conspicuously vocal throughout our sampling window of 6-8 weeks, but species-specific vocalization rates frequently varied by two-fold or more among mornings separated by just a few days (Figure 4). In 2016, fluctuations in daily vocalization activity were quite concordant between Hubbard Brook and Jeffers Brook (left-hand column in Figure 4). In 2018, the earlier recordings captured comparatively high activity from Black-throated Blue Warblers, Black-throated Green Warblers, and Ovenbirds, and comparatively low activity from Red-eyed Vireos.

| Correlations within species in vocalization activity
To further evaluate intraspecific spatial correlation in vocal activity, we tested for correlated vocalization dynamics both between habitats (forest age) within a site and between watersheds

Examination of georeferenced animations of daily vocalization rates across the study area (Appendix S1) suggested modest spatial correlation that depended on the species. Further resolution was permitted by spatial correlograms (Figure 6). Similar to the 2016 data, there was evidence of spatial correlations in vocalization rates, with Ovenbirds again showing particularly strongly correlated dynamics.
However, with the possible exception of Red-eyed Vireos, there was little evidence for elevated correlations among nearby locations.

| Correlations among species in vocalization activity
Many species pairs had correlated peaks and troughs in daily vocalization rates (Table 6). The 66 pairwise correlations were disproportionately positive (45 correlations were positive, 10 were significant; 21 correlations were negative, none significant). The overall mean correlation was r = .14 with a SD = 0.29. There were two clusters of covarying species, and these tended to be negatively correlated with each other (note the positive correlations in the upper left and lower right of

FIGURE 2 Comparison of estimates of ambient sound pressure from (1) manually identified seconds with no bird vocalizations (y-axis) to (2) automated analysis of ambient sound pressure (x-axis). Dashes represent the line of equality. Units are log10(RMS).
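The tally of pairwise correlations among species can be sketched as follows. The count of 66 pairs is consistent with all pairs among 12 species (12 × 11 / 2 = 66). The example below uses four hypothetical species with invented daily rates, arranged so that two clusters emerge; the species labels, numbers, and function names are assumptions for illustration only.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical daily vocalization rates (one value per date) for four species.
daily_rates = {
    "Species A": [9, 6, 3, 7, 5],
    "Species B": [8, 5, 2, 6, 4],
    "Species C": [2, 4, 7, 3, 5],
    "Species D": [1, 5, 6, 2, 6],
}

# Correlate every unordered pair of species once.
pairwise = {
    (a, b): pearson(daily_rates[a], daily_rates[b])
    for a, b in combinations(sorted(daily_rates), 2)
}

n_positive = sum(r > 0 for r in pairwise.values())
n_negative = sum(r < 0 for r in pairwise.values())
mean_r = sum(pairwise.values()) / len(pairwise)
print(len(pairwise), n_positive, n_negative, round(mean_r, 2))
```

Here species A and B covary, as do C and D, while correlations between the two groups are negative. Significance testing is omitted from this sketch.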

| Species accumulation and optimization of bioacoustic sampling schemes
The average number of species detected in one 10-minute sample at one recording location was 5-6 species (α-diversity, Figure 8). The number of new detections at an average location increased to about 13 species with 7 days of sampling (β-diversity, Figure 8). The expected total number of species detections increased from about 20 to 25 species if 19 ten-min samples were drawn from 10 locations vs. all from the same location (γ-diversity in Figure 8).
As anticipated, more species were detected when we sampled additional days and included additional recording sites within the habitat matrix (Figures S4 and S5). However, adding locations resulted in more species per unit of analysis effort than did adding more days (Figure 8). In our study system, the expected species detection curve saturates at 30 species when sampling in one location and at 41 species when the same total sampling time is distributed across ten locations (Figure 8).
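The comparison between sampling one location repeatedly and spreading the same effort across locations can be illustrated with a small resampling sketch. The detections below are hypothetical and deliberately tiny, and the function name, replicate count, and seed are assumptions; this is not the rarefaction procedure used in the study, only an illustration of the general idea of averaging species counts over random draws of samples.

```python
import random

def mean_richness(samples, k, n_reps=1000, seed=0):
    """Average number of distinct species in k samples drawn without replacement."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_reps):
        drawn = rng.sample(samples, k)
        total += len(set().union(*drawn))
    return total / n_reps

# Hypothetical detections: for each location, one set of species per occasion.
detections = {
    "loc1": [
        {"Ovenbird", "Red-eyed Vireo"},
        {"Ovenbird"},
        {"Ovenbird", "Winter Wren"},
    ],
    "loc2": [
        {"American Redstart", "Red-eyed Vireo"},
        {"American Redstart"},
        {"American Redstart", "Black-throated Green Warbler"},
    ],
}

# Three samples taken from a single location.
one_location = mean_richness(detections["loc1"], k=3)

# Three samples drawn at random from the pool of all locations.
pooled = [s for loc in sorted(detections) for s in detections[loc]]
across_locations = mean_richness(pooled, k=3)

print(one_location, across_locations)
```

Because the two hypothetical locations differ in species composition, drawing the same number of samples from the pooled set yields more species on average than drawing them all from one location.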

| DISCUSSION
Our examination of bird vocalization patterns from multiple sites, years, and recording units generated knowledge of the study system, as well as insights regarding methods. The species lists from acoustic data (Table 1) were largely congruent with decades of observer-based field studies (see Holmes & Likens, 2016). For example, the five species responsible for 79-89% of all recorded vocalizations are the most abundant breeding birds in this location and the rest of our species list (Table 1)  there was strong alignment between the calls measured here and previously published measurements (Rivers & Kroodsma, 2000).

| Environmental variables and ambient sound
We evaluated multiple environmental variables for their relations with vocalization activity. The literature includes numerous reports of such relationships. For example, sunlight (Miller, 2006; Thomas et al., 2002), moonlight (York et al., 2014), temperature (Garson & Hunter, 1979; Gottlander, 1987; Thomas, 1999), and atmospheric pressure (Prevost, 2016) can affect signaling activity. Wind and rain can interact to affect vocalization activity in a variety of bird species, ranging from King Penguins to Grasshopper Sparrows (Lengagne et al., 1999; Lenske & La, 2014; Prevost, 2016). In our studies, the only environmental variable with notable effects was background sound, which was primarily due to the sound of water dripping from the canopy.

TABLE 5 Correlations between vocalization rates of individual songbird species and environmental variables (N = 20 dates for 2016, N = 19 dates for 2018). Vocalization rates are the average for each date of square-root transformed rates for all recorders at Hubbard Brook (5 in 2016 and 10 in 2018). Analyses for 2018 exclude two rainy dates. See Tables 3 and 4 for correlations among environmental variables.
Establishing statistical criteria for identifying recordings with high ambient sound ("noise" from the perspective of animal vocalizations) provides an objective way to filter recordings for analysis and identify how species respond to background noise. The automated approach to assessing ambient sound provided results that were highly correlated with the manual approach (Figure 2) and can readily be applied to large numbers of sound recordings. In our study system, some bird species showed greater declines in vocalization when ambient sound was relatively high (
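To make the idea of an objective ambient-sound filter concrete, the sketch below computes a log10(RMS) level (the units given for Figure 2) from raw amplitude samples and flags recordings above a cutoff. The sample values, the cutoff, and the function names are hypothetical; the source does not state the specific criterion used, so the threshold here is purely an assumption.

```python
import math

def log10_rms(samples):
    """Log10 of the root-mean-square amplitude of a sequence of samples."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return math.log10(rms)

def flag_noisy(levels, threshold):
    """Return True for each recording whose ambient level exceeds the threshold."""
    return [level > threshold for level in levels]

# Hypothetical amplitude samples from bird-free seconds of two recordings.
quiet_recording = [1, -1, 1, -1]
noisy_recording = [10, -10, 10, -10]

levels = [log10_rms(quiet_recording), log10_rms(noisy_recording)]

# Assumed cutoff, chosen only for illustration.
noisy_flags = flag_noisy(levels, threshold=0.5)
print(levels, noisy_flags)
```

In practice a cutoff would be chosen from the distribution of ambient levels across all recordings rather than set by hand.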

| Variation in signaling activity across space and time
In the third approach, we compared vocal activity in two watersheds with similar forest composition and land use history. Despite the proximity and similarity of these sites, several bird species were common in one site and rare or absent in the other (Figure 3). These findings underscore the value of replicating locations within
Signaling is risky, time consuming, and metabolically costly (Falk et al., 2015; Godin & McDonough, 2003; Prestwich & Walker, 1981; Symes et al., 2015; Taigen & Wells, 1985). Although the characteristics of the individual signals have been well studied, much less is known about how singing activity varies from day to day or season to season. Additional sampling would be required to test for the stability in our study system of seasonal timing across years (Method 4).
Most of the species that we studied displayed conspicuous peaks and troughs in day-to-day vocalization rates (at one hour post-dawn) that were only weakly explained by abiotic factors (Table 3). This suggests a prominent role for behavioral ecology in understanding the patterning of bird vocalization rates. Possible predictors include the timing of territory establishment, nest-building, egg-laying, incubation, hatching, and fledging. The behavior of birds on neighboring territories could also be a factor (Sillett et al., 2004). Better understanding of the relationship between reproductive behavior and signaling rate can inform acoustic sampling strategies and expand interpretations of acoustic data from times and places when observer-based sampling is constrained, such as on military ranges or during the Covid-19 pandemic. At longer time scales, advances in the analysis and interpretation of acoustic data can expand studies of seasonal and interannual phenology, particularly to times of year when it is difficult for researchers to access sites due to environmental conditions or competing professional demands (Marra et al., 2015). Recording devices that are timed to automatically start before breeding activity begins will make it more feasible to determine when migratory birds arrive and can spare field biologists the sometimes frustrating work of sampling to determine that the season has not yet begun.

| Intraspecific synchrony in signaling
By combining data from multiple recording units, large synchronous acoustic datasets open a frontier of opportunities for studying behavior across landscapes (Valcu & Kempenaers, 2010). While presence/absence data are ecologically informative, counts of individual vocalizations can provide additional insight into behavior. We employed statistical approaches from population ecology (e.g., spline correlograms) that were developed to test for correlated dynamics in abundance and are readily extended to vocalization activity (Method 5). In our study system, the daily vocalization rates of some common species rose and fell synchronously between sites separated by 15 km (Figure 5), but not necessarily more so at 200 m than at 1500 m (Figure 6). Weather only explained a modest fraction of the spatiotemporal correlations. Other hypotheses include food availability, concordant endogenous rhythms in reproduction, or neighbor effects on behavior that extend a surprisingly long way. Playback experiments could be a powerful tool for testing hypotheses drawn from behavioral ecology (Greenfield, 1988).
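The logic of relating correlation to distance can be sketched with a simple binned correlogram. This is a deliberately simplified stand-in for the spline correlograms mentioned above, not the method used in the study: it averages pairwise Pearson correlations within distance bins. The coordinates, daily rates, bin width, and function names are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def binned_correlogram(coords, series, bin_width):
    """Mean pairwise correlation per distance bin, keyed by the bin's lower edge."""
    sums = {}
    for a, b in combinations(sorted(coords), 2):
        distance = math.dist(coords[a], coords[b])
        edge = int(distance // bin_width) * bin_width
        total, count = sums.get(edge, (0.0, 0))
        sums[edge] = (total + pearson(series[a], series[b]), count + 1)
    return {edge: total / count for edge, (total, count) in sorted(sums.items())}

# Hypothetical recorder positions in meters.
coords = {"A": (0, 0), "B": (200, 0), "C": (1500, 0)}
# Hypothetical daily vocalization rates of one species at each recorder.
series = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 4]}

correlogram = binned_correlogram(coords, series, bin_width=500)
print(correlogram)
```

A flat profile across bins would correspond to the pattern described above, in which correlations at 200 m are not necessarily higher than those at 1500 m.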

| Interspecific synchrony in signaling
Our analyses of interspecific correlations (Approach 6) suggested the existence of functional groups of species based on vocalization

FIGURE 6 Intraspecific correlations in vocalization rates of four songbird species as a function of distance between recording locations (n = 19 dates × 10 locations in Hubbard Brook in 2018).

| Synthesis and conclusions
There is a natural match between data from passive acoustic monitoring and the classical concepts from community ecology of α-, β-, and γ-diversity (Figure 8). These diversity metrics provide a framework for comparing communities of birds and other vocalizing animals, such as assessing how similar acoustic data from a tropical rainforest would compare with respect to α-, β-, and γ-diversity to similar acoustic data from migrant songbird communities in north-temperate forests. Such comparisons could address general ecological theory, contribute to biodiversity assessments, and contribute to optimization of sampling strategies. For example, the benefits for biodiversity assessments of adding days vs. locations from acoustic sampling will depend on the relative strength of β- and γ-diversity (Figure 9). Presumably, β-diversity will be related to the strength of seasonality and the concordance between co-occurring species in the timing of their breeding. High γ-diversity suggests that conservation efforts should consider relatively large management units, perhaps partly because of ecologically important variation within what appear to be uniform habitat types. The extrapolation of species accumulation curves (rarefaction) provides a tool for judging when the species composition of a community is well known vs.
Passive acoustic monitoring in well-studied sites such as the  Table 2 strong foundation from observer-based ecological research. Having a family of technical approaches for bioacoustic data will leverage the growing power of automated data extraction, but remain critically intertwined with observer-based field biology. Acoustic data can reveal trends, interactions, and seasonal patterns in sound pressure, but direct observations and experiments will remain crucial for asking questions, interpreting data, testing hypotheses, and developing general theory. The combination of direct observations and passive recordings offers general opportunities for understanding birds and other acoustically active organisms.
FIGURE 8 Relationship between total number of bird species detected via passive acoustic monitoring and the number of recorder days that were analyzed, as measured at Hubbard Brook. The blue curve indicates the average species accumulation for a single location. The black curve indicates the species accumulation for an equal number of samples randomly drawn from any of the 10 locations.
FIGURE 9 Total number of bird species detected with simulated bioacoustic sampling from different numbers of locations and sampling occasions. The boundary of the yellow and blue indicates possible combinations with analyses of 190 total sound files, as in our study; our sampling design (10 locations × 19 occasions) is indicated at the far end of the boundary. Inset at right shows top view of same. The surface and points show averages of 1000 replicate samples drawn at random from a simulated data set modeled after our data (Appendix S1). The pattern shows that adding locations generally added to total species detections more than adding occasions.
The approaches described in this paper provide tools for employing acoustic data to address basic and applied questions regarding the nature of biological communities. Passive acoustic monitoring opens sampling strategies that have historically been difficult for human observers, including collecting replicated synchronous data at many sample locations. Over time, archived recordings will continue to provide a source of data that can be mined with increasingly sophisticated detection algorithms and statistical analyses. Expanded capacity for recording and analysis is fueling growth in occupancy analysis and density estimation (Furnas & Callas, 2015; Prevost, 2016; Sebastián-González et al., 2018), which are burgeoning fields of growing value to population ecology and conservation biology (Marques et al., 2013).
Some of the approaches described here are directly related to occupancy analysis (especially approach #7, using rarefaction analysis to quantify diversity and optimize bioacoustic sampling schemes). Other approaches make use of information in vocalization rates that goes beyond presence-absence data (especially approaches 4-6). Recent advances in recording hardware and continuing advances in data extraction software are permitting unprecedented access to acoustic data over space and time (Shiu et al., 2020; Vickers et al., 2019). It seems likely that our ability to collect acoustic data will continue to exceed our capacity for analysis and interpretation. This places a premium on being strategic in framing questions, choosing hand annotation subsets, and designing analyses to evaluate acoustic data. The approaches presented in this paper provide some guideposts for analyzing, interpreting, and applying the influx of acoustic data.
Historically, soundscape analysis has often relied on assessing statistical signatures in data to understand ecological dynamics and patterns in biodiversity (Gottesman et al., 2020; Pieretti et al., 2011; Sueur et al., 2008, 2014). Incorporating detailed information about species composition and signaling rate will inform our interpretation of the patterns seen in ecoacoustic data and will enhance our ability to understand how the acoustic signatures of environments relate to the underlying biological and ecological dynamics.
Ecosystem Study and associated long-term datasets. Russ Charif provided exceptionally useful feedback on the manuscript.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.